Restricted Eigenvalue from Stable Rank with Applications to Sparse Linear Regression
Authors
Abstract
High-dimensional sparse linear regression is a basic problem in machine learning and statistics. Consider a linear model y = Xθ + w, where y ∈ R^n is the vector of observations, X ∈ R^{n×d} is the covariate matrix whose ith row holds the covariates of the ith observation, and w ∈ R^n is an unknown noise vector. In many applications the model is high-dimensional in nature, meaning that the number of observations n may be substantially smaller than the number of covariates d. In such settings it is common to assume that θ is sparse, and the goal of sparse linear regression is to estimate this sparse θ given (X, y). In this paper, we study a variant of the traditional sparse linear regression problem in which each of the n covariate vectors in R^d is individually projected by a random linear transformation to R^m with m ≪ d. Such transformations are commonly applied in practice to save resources such as storage space, transmission bandwidth, and processing time. Our main result shows that one can estimate θ with low ℓ2-error, even with access only to these projected covariate vectors, under some mild assumptions on the problem instance. Our approach is based on solving a variant of the popular Lasso optimization problem. While the conditions (such as the restricted eigenvalue condition on X) under which the standard Lasso formulation successfully estimates θ are well understood, we investigate the conditions under which this variant of Lasso estimates θ. The main technical ingredient of our result, a bound on the restricted eigenvalue of certain projections of a deterministic matrix satisfying a stable rank condition, may be of interest beyond sparse regression.
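As a rough illustration of the setup, the sketch below simulates the model, compresses each covariate vector with a random linear map, and fits a Lasso in the projected space. The shared Gaussian projection Φ, all dimensions, and the back-projection Φᵀβ used as a proxy for θ are illustrative assumptions, not the paper's estimator.

```python
# A minimal sketch of the projected-covariates setup, assuming a single
# Gaussian projection shared by all observations; the paper's Lasso variant
# and estimator may differ.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, d, m, k = 200, 1000, 400, 5           # observations, ambient dim, projected dim, sparsity

# Sparse ground truth theta and raw covariates.
theta = np.zeros(d)
theta[rng.choice(d, size=k, replace=False)] = rng.normal(size=k)
X = rng.normal(size=(n, d))
y = X @ theta + 0.1 * rng.normal(size=n)

# Each covariate vector is compressed by a random linear map Phi: R^d -> R^m.
Phi = rng.normal(size=(m, d)) / np.sqrt(m)
Z = X @ Phi.T                            # the learner only sees (Z, y)

# Fit a Lasso on the projected covariates; Phi.T @ beta is one plausible
# proxy for theta (an illustrative choice, not the paper's construction).
beta = Lasso(alpha=0.05).fit(Z, y).coef_
theta_hat = Phi.T @ beta
print("relative l2 error:", np.linalg.norm(theta_hat - theta) / np.linalg.norm(theta))
```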
Similar Resources
The lower tail of random quadratic forms, with applications to ordinary least squares and restricted eigenvalue properties
Finite sample properties of random covariance-type matrices have been the subject of much research. In this paper we focus on the "lower tail" of such a matrix, and prove that it is subgaussian under a simple fourth-moment assumption on the one-dimensional marginals of the random vectors. A similar result holds for more general sums of random positive semidefinite matrices, and the (relatively ...
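As a toy illustration of the quantity studied here (not taken from the paper), the following Monte Carlo estimates how often the smallest eigenvalue of a sample covariance matrix, whose population covariance is the identity, falls well below 1.

```python
# Monte Carlo estimate of the lower tail of the smallest eigenvalue of a
# covariance-type matrix X^T X / n with identity population covariance.
# Dimensions and the threshold t are illustrative.
import numpy as np

rng = np.random.default_rng(1)
n, d, trials, t = 200, 20, 2000, 0.5
count = 0
for _ in range(trials):
    X = rng.normal(size=(n, d))                   # rows with covariance I_d
    lam_min = np.linalg.eigvalsh(X.T @ X / n)[0]  # smallest eigenvalue
    count += lam_min < 1 - t
print("empirical P(lam_min < 1 - t):", count / trials)
```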
Near-Optimal Estimation of Simultaneously Sparse and Low-Rank Matrices from Nested Linear Measurements
In this paper we consider the problem of estimating simultaneously low-rank and row-wise sparse matrices from nested linear measurements, where the linear operator consists of the product of a linear operator W and a matrix Ψ. Leveraging the nested structure of the measurement operator, we propose a computationally efficient two-stage algorithm for estimating the simultaneously structured target...
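A heavily simplified sketch of the two-stage idea, under the illustrative assumption that the nested measurements take the form Y = WΨM for a row-sparse, low-rank M: stage one undoes the outer operator W by least squares, and stage two uses a crude row-support estimate plus a restricted least-squares refit. The paper's actual operators and algorithm may differ substantially.

```python
# Toy two-stage recovery from nested measurements Y = W @ (Psi @ M); all
# sizes are illustrative, and the support step is a crude heuristic shown
# only to convey the two-stage structure (it can fail on hard instances).
import numpy as np

rng = np.random.default_rng(2)
d1, d2, p, q, r, s = 60, 8, 40, 60, 2, 6

# Simultaneously rank-r and s-row-sparse target M (d1 x d2).
rows = rng.choice(d1, size=s, replace=False)
M = np.zeros((d1, d2))
M[rows] = rng.normal(size=(s, r)) @ rng.normal(size=(r, d2))

Psi = rng.normal(size=(p, d1))                   # inner matrix
W = rng.normal(size=(q, p))                      # outer operator
Y = W @ (Psi @ M)                                # nested measurements

# Stage 1: least squares removes W, estimating the smaller matrix Psi @ M.
inner_hat = np.linalg.lstsq(W, Y, rcond=None)[0]

# Stage 2: estimate the row support by correlating with Psi, then refit by
# least squares restricted to the selected rows.
corr = Psi.T @ inner_hat
keep = np.argsort(np.linalg.norm(corr, axis=1))[-s:]
M_hat = np.zeros_like(M)
M_hat[keep] = np.linalg.lstsq(Psi[:, keep], inner_hat, rcond=None)[0]
print("relative error:", np.linalg.norm(M_hat - M) / np.linalg.norm(M))
```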
High-dimensional classification by sparse logistic regression
We consider high-dimensional binary classification by sparse logistic regression. We propose a model/feature selection procedure based on penalized maximum likelihood with a complexity penalty on the model size and derive non-asymptotic bounds for the resulting misclassification excess risk. These bounds can be reduced under an additional low-noise condition. The proposed complexity penalty ...
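For flavor, the sketch below fits a sparse logistic regression. Note the paper penalizes model size (an ℓ0-type complexity penalty), whereas this example substitutes the common ℓ1 (lasso) penalty as a convex surrogate; all sizes and the regularization strength are illustrative.

```python
# Sparse logistic regression via an l1 penalty (a surrogate for the paper's
# model-size complexity penalty); dimensions and C are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n, d, k = 300, 1000, 10
theta = np.zeros(d)
theta[:k] = 1.0                                  # k-sparse ground truth
X = rng.normal(size=(n, d))
y = rng.binomial(1, 1 / (1 + np.exp(-X @ theta)))

clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
print("nonzero coefficients:", np.count_nonzero(clf.coef_))
print("training accuracy:", clf.score(X, y))
```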
Nystrom Approximation for Sparse Kernel Methods: Theoretical Analysis and Empirical Evaluation
Nyström approximation is an effective approach to accelerate the computation of kernel matrices in many kernel methods. In this paper, we consider the Nyström approximation for sparse kernel methods. Instead of relying on the low-rank assumption on the original kernels, which does not hold in some applications, we take advantage of the restricted eigenvalue condition, which has been p...
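For reference, here is a minimal sketch of the standard Nyström approximation of an RBF kernel matrix; the kernel, sizes, and uniform landmark sampling are illustrative choices, not this paper's method.

```python
# Standard Nystrom approximation: sample l landmark columns C and the
# landmark block K_ll, then approximate K ~= C @ pinv(K_ll) @ C.T.
import numpy as np

rng = np.random.default_rng(4)
n, l = 500, 50
X = rng.normal(size=(n, 5))

def rbf(A, B, gamma=0.5):
    # Pairwise squared distances, then the Gaussian (RBF) kernel.
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

K = rbf(X, X)
idx = rng.choice(n, size=l, replace=False)       # uniform landmark sampling
C = K[:, idx]                                    # n x l sampled columns
K_ll = K[np.ix_(idx, idx)]                       # l x l landmark block
K_nys = C @ np.linalg.pinv(K_ll) @ C.T           # rank-l approximation
print("relative Frobenius error:", np.linalg.norm(K - K_nys) / np.linalg.norm(K))
```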
Generalized Linear Model Regression under Distance-to-set Penalties
Estimation in generalized linear models (GLMs) is complicated by the presence of constraints. One can handle constraints by maximizing a penalized log-likelihood. Penalties such as the lasso are effective in high dimensions but often lead to unwanted shrinkage. This paper instead explores penalizing the squared distance to constraint sets. Distance penalties are more flexible than algebraic and...
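A toy sketch of a distance-to-set penalty, assuming a logistic (GLM) loss and the nonnegative orthant as the constraint set. For convex C, the gradient of (ρ/2)·dist(θ, C)² is ρ(θ − proj_C(θ)), which makes plain gradient descent straightforward; the step size, penalty weight, and problem sizes are all illustrative.

```python
# Gradient descent on logistic loss + (rho/2) * dist(theta, C)^2 with
# C the nonnegative orthant; step size, rho, and sizes are illustrative.
import numpy as np

rng = np.random.default_rng(5)
n, d, rho, lr = 200, 10, 5.0, 0.1
theta_true = np.abs(rng.normal(size=d))          # nonnegative ground truth
X = rng.normal(size=(n, d))
y = rng.binomial(1, 1 / (1 + np.exp(-X @ theta_true)))

def proj(t):
    # Projection onto C = {theta : theta >= 0}.
    return np.maximum(t, 0.0)

theta = np.zeros(d)
for _ in range(500):
    p = 1 / (1 + np.exp(-X @ theta))             # predicted probabilities
    grad = X.T @ (p - y) / n + rho * (theta - proj(theta))
    theta -= lr * grad

print("smallest coefficient (near-feasible):", theta.min())
```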